Robust Regression Analysis of Copy Number Variation Data based on a Univariate Score

Authors

  • Glen A. Satten
  • Andrew S. Allen
  • Morna Ikeda
  • Jennifer G. Mulle
  • Stephen T. Warren
Abstract

Motivation: The discovery that copy number variants (CNVs) are widespread in the human genome has motivated the development of numerous algorithms that attempt to detect CNVs from intensity data. However, all approaches are plagued by high false discovery rates. Further, because CNVs are characterized by two dimensions (length and intensity), it is unclear how to order called CNVs to prioritize experimental validation.

Results: We developed a univariate score that correlates with the likelihood that a CNV is true. This score can be used to order CNV calls so that calls with larger scores are more likely to overlap a true CNV. We developed cnv.beast, a computationally efficient algorithm for calling CNVs that uses robust backward elimination regression to keep CNV calls with scores that exceed a user-defined threshold. Using an independent dataset measured on a different platform, we validated our score and showed that our approach performed better than six other currently available methods.

Availability: cnv.beast is available at http://www.duke.edu/~asallen/Software.html.

Similar resources

Fuzzy Robust Regression Analysis with Fuzzy Response Variable and Fuzzy Parameters Based on the Ranking of Fuzzy Sets

Robust regression is an appropriate alternative for ordinal regression when outliers exist in a given data set. If we have fuzzy observations, ordinal regression methods cannot model them; in this case, fuzzy regression is a good method. When observations are fuzzy and there are outliers in the data sets, robust fuzzy regression methods are appropriate alternatives....

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: As science, knowledge, and technology evolve, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and their interpretation. For high-dimensional problems, classical methods are not reliable because of a large number of predictor variable...

Multivariate and univariate analysis of genetic variation in Iranian summer savory (Satureja hortensis L.) accessions based on morphological traits

To evaluate the genetic variation in Iranian summer savory accessions, different accessions were analyzed using multivariate and univariate analysis. Results indicated significant differences in some traits. The mean comparison analysis using the least significant difference (LSD) test revealed significant differences among the accessions under study. In this regard, the hig...

The analysis of residuals variation and outliers to obtain robust response surface

In this paper, the main idea is to compute a robust regression model, derived by experimentation, in order to achieve a model with minimal effects of outliers and fixed variation among different experimental runs. Both outliers and unequal residual variation can affect the response surface parameter estimation. The common way to estimate the regression model coefficients is the ordinar...

Variability and Correlation between the Seed Yield and its Component in Alfalfa (Medicago sativa L.) Populations under Dry Land Farming System, Hamadan, Iran

To study the variation in seed yield and its components, 200 accessions of alfalfa (Medicago sativa L.) were sown as drilled plots, using unreplicated alpha designs with two repeated entries within all 10 blocks, under a dry land farming system at Kabodarahang Research Station, Hamadan, Iran, during 2010 to 2011. Data were analyzed for descriptive statistics, correlation, regressio...

Journal title:

Volume 9  Issue 

Pages  -

Publication date 2014